 synthetic personae


Whose Personae? Synthetic Persona Experiments in LLM Research and Pathways to Transparency

Batzner, Jan, Stocker, Volker, Tang, Bingjun, Natarajan, Anusha, Chen, Qinhao, Schmid, Stefan, Kasneci, Gjergji

arXiv.org Artificial Intelligence

Synthetic personae experiments have become a prominent method in Large Language Model alignment research, yet the representativeness and ecological validity of these personae vary considerably between studies. Through a review of 63 peer-reviewed studies published between 2023 and 2025 in leading NLP and AI venues, we reveal a critical gap: task and population of interest are often underspecified in persona-based experiments, despite personalization being fundamentally dependent on these criteria. Our analysis shows substantial differences in user representation, with most studies focusing on limited sociodemographic attributes and only 35% discussing the representativeness of their LLM personae. Based on our findings, we introduce a persona transparency checklist that emphasizes representative sampling, explicit grounding in empirical data, and enhanced ecological validity. Our work provides both a comprehensive assessment of current practices and practical guidelines to improve the rigor and ecological validity of persona-based evaluations in language model alignment research.


Concerns on Bias in Large Language Models when Creating Synthetic Personae

Haxvig, Helena A.

arXiv.org Artificial Intelligence

One immense concern relates to the existence of bias in the models, and creating synthetic personae has the potential to aid the investigation of how different forms of bias manifest in LLMs, by introducing a new method of testing. However, the black-box nature of a majority of these models, and their inability to express 'opinions' contrary to overall LLM rules or fail-safes, introduce complexities in how to prompt the models to act out specific synthetic personae in various scenarios. This position paper introduces an exploration of a few fundamental questions: What are the benefits and drawbacks of using synthetic personae in HCI research, and how can we customize them beyond the limitations of current LLMs? The perspectives presented in this paper have sprung from the sub-study of a PhD project on Artificial Intelligence and Participatory Design [18]. The sub-study, currently a work in progress, aims at developing a novel method of adversarial testing [6, 13, 21] through the use of contextualized "real-life" vignettes [2, 16] prompted to the interfaces of multiple LLMs to identify potential bias, trying to open up the "black box" from a more qualitative human-computer interaction perspective [10].

2 BIAS DETECTION IN LLM INTERFACES

Research in various sub-fields has shown that human engagement in AI design, development, and evaluation, particularly in a qualitative manner, can ensure a focus on the socio-technical embeddedness of AI [3].


Exploring Augmentation and Cognitive Strategies for AI based Synthetic Personae

Gonzalez, Rafael Arias, DiPaola, Steve

arXiv.org Artificial Intelligence

Large language models (LLMs) hold potential for innovative HCI research, including the creation of synthetic personae. However, their black-box nature and propensity for hallucinations pose challenges. To address these limitations, this position paper advocates for using LLMs as data augmentation systems rather than zero-shot generators. We further propose the development of robust cognitive and memory frameworks to guide LLM responses. Initial explorations suggest that data enrichment, episodic memory, and self-reflection techniques can improve the reliability of synthetic personae and open up new avenues for HCI research.